Consensus Based Ensembles of Soft Clusterings

نویسندگان

  • Kunal Punera
  • Joydeep Ghosh
چکیده

Cluster Ensembles is a framework for combining multiple partitionings obtained from separate clustering runs into a final consensus clustering. This framework has attracted much interest recently because of its numerous practical applications, and a variety of approaches including Graph Partitioning, Maximum Likelihood, Genetic algorithms, and Voting-Merging have been proposed. The vast majority of these approaches accept hard clusterings as input. There are, however, many clustering algorithms such as EM and fuzzy c-means that naturally output soft partitionings of data, and forcibly hardening these partitions before obtaining a consensus potentially involves loss of valuable information. In this paper we propose several consensus algorithms that work on soft clusterings and experiment with many real-life datasets to empirically show that using soft clusterings as input does offer significant advantages, especially when dealing with vertically partitioned data.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Cluster Ensembles for High Dimensional Clustering: An Empirical Study

This paper studies cluster ensembles for high dimensional data clustering. We examine three different approaches to constructing cluster ensembles. To address high dimensionality, we focus on ensemble construction methods that build on two popular dimension reduction techniques, random projection and principal component analysis (PCA). We present evidence showing that ensembles generated by ran...

متن کامل

Combinación de clusterizadores difusos mediante voto posicional para clustering robusto de documentos

The combination of multiple clustering processes provides a means for building robust document clustering systems. This work focuses on the consolidation of fuzzy clusterings, proposing two consensus functions for soft cluster ensembles based on the Borda and Condorcet positional voting strategies. Experiments conducted on two document corpora reveal that the proposed soft consensus functions a...

متن کامل

انتخاب اعضای ترکیب در خوشه‌بندی ترکیبی با استفاده از رأی‌گیری

Clustering is the process of division of a dataset into subsets that are called clusters, so that objects within a cluster are similar to each other and different from objects of the other clusters. So far, a lot of algorithms in different approaches have been created for the clustering. An effective choice (can combine) two or more of these algorithms for solving the clustering problem. Ensemb...

متن کامل

A CLUE for CLUster Ensembles

Cluster ensembles are collections of individual solutions to a given clustering problem which are useful or necessary to consider in a wide range of applications. The R package ̃clue provides an extensible computational environment for creating and analyzing cluster ensembles, with basic data structures for representing partitions and hierarchies, and facilities for computing on these, including...

متن کامل

Spatially-Aware Comparison and Consensus for Clusterings

This paper proposes a new distance metric between clusterings that incorporates information about the spatial distribution of points and clusters. Our approach builds on the idea of a Hilbert space-based representation of clusters as a combination of the representations of their constituent points. We use this representation and the underlying metric to design a spatially-aware consensus cluste...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Applied Artificial Intelligence

دوره 22  شماره 

صفحات  -

تاریخ انتشار 2007